A NOVEL GENE AND USES THEREFOR-IIb
FIELD OF THE INVENTION 

The present invention relates generally to a novel human gene and to derivatives and mammalian, 
animal, avian, insect, nematode, and microbial homologues thereof. The present invention 
further provides pharmaceutical compositions and diagnostic agents as well as genetic molecules 
useful in gene replacement therapy and recombinant molecules useful in protein replacement 
therapy.

Bibliographic details of the publications referred to by author in this specification are collected 
at the end of the description. Sequence Identity Numbers (SEQ ID NOs.) for the nucleotide and 
amino acid sequences referred to in the specification are defined after the bibliography. 

BACKGROUND OF THE INVENTION 

The increasing sophistication of recombinant DNA technology is greatly facilitating research and 
development in the medical and allied health fields. There is a growing need to develop
recombinant and genetic molecules for use in diagnosis, conventional pharmaceutical
preparations as well as gene and protein replacement therapies.

In work leading up to the present invention, the inventors sought to identify and clone human 
genes which might be useful as potential diagnostic and/or therapeutic agents. One area of
particular interest is the field of signal transduction.

Knowledge of cellular interaction in the control of cell proliferation is essential in the rational
design of specific therapeutic strategies aimed at controlling proliferative disorders. Such
proliferative disorders include a range of cancers, inflammatory conditions and atherosclerosis.
An important aspect of cellular interaction is signal transduction via receptors to intracellular
transducers. One key signal transducer is Ras, which couples the receptors for diverse
extracellular signals to different effectors. Ras directly activates the downstream kinase Raf,
which in turn induces the mitogen activated protein kinase (MAPK) cascade.

Ras is an example of a guanine nucleotide exchange factor (GEF). A mutation in a GEF
such as Ras has been implicated in the development of a range of cancers and tumours. There is a
need, therefore, to identify new GEFs and to develop therapeutic and diagnostic protocols based
on modulating the function of GEF signalling pathways.

SUMMARY OF THE INVENTION 

Throughout this specification, unless the context requires otherwise, the word "comprise", or 
variations such as "comprises" or "comprising", will be understood to imply the inclusion of a 
stated element or integer or group of elements or integers but not the exclusion of any other 
element or integer or group of elements or integers. 

One aspect of the present invention contemplates an isolated nucleic acid molecule comprising 
a sequence of nucleotides encoding or complementary to a sequence encoding an amino acid 
sequence having homology to a guanine nucleotide exchange factor (GEF) or a derivative of said 
gene regulator. 

Another aspect of the present invention is directed to an isolated nucleic acid molecule 
comprising a sequence of nucleotides or a complementary form thereof selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:1;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:2;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 



Even yet another aspect of the present invention provides a genetic construct comprising a vector
portion and an animal, more particularly a mammalian and even more particularly a human mcg7
gene portion, which mcg7 gene portion is capable of encoding an MCG7 polypeptide or a
functional or immunologically interactive derivative thereof.

Still yet another aspect of the present invention contemplates a method of detecting a condition
caused or facilitated by an aberration in mcg7, said method comprising determining the presence
of a single or multiple nucleotide substitution, deletion and/or addition or other aberration to one
or both alleles of said mcg7 wherein the presence of such a nucleotide substitution, deletion
and/or addition or other aberration may be indicative of said condition or a propensity to develop
said condition.

Even still a further aspect of the present invention relates to a method of detecting a condition
caused or facilitated by an aberration in mcg7, said method comprising screening for a single or
multiple amino acid substitution, deletion and/or addition to MCG7 wherein the presence of such
a mutation is indicative of said condition or of a propensity to develop said condition.

Another aspect of the present invention contemplates a method for detecting MCG7 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG7 or its derivatives or homologues for a time and under
conditions sufficient for an antibody-MCG7 complex to form, and then detecting said complex.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a representation showing similarity of MCG7 with GEFs of various organisms. 

Figure 2(a) is a representation of the nucleotide sequence and corresponding amino acid
sequence of mcg7. An alternatively spliced exon is shown in the nucleotide sequence in lower case
(nucleotides 183-288).

Figure 2(b) is a representation of the partial nucleotide sequence and corresponding amino acid
sequence of mcg7 but without the exon shown in Fig. 2(a). Amino acids have been numbered
from the first methionine codon (underlined). The cDNA molecules of Fig. 2(a) and Fig. 2(b)
differ by the inclusion and exclusion of the exon shown in Figure 2(a) in lower case.

Figure 3 is a representation showing a comparison between MCG7 and a homologue from
Caenorhabditis elegans using the BESTFIT algorithm. In the figure, the following sequences
are underlined:

EF-HAND = PROSITE DATABASE NO. PDOC00018

1a  nematode  DVDEEDEVEDIEF
1b  human     DVDGDGHISQEEF
    nematode  DHDRDGFISQEEF
1c  human     DQNQDGCISREEM
    nematode  DVDMDGQISKDEL

GUANINE NT BINDING REGION = BLOCKS DATABASE NO. BL00720B

2   human     HFVHVAEKLLQLQNFNTLMAVVGGLSHSSISRLKETH
    nematode  KFVHVAKHLRKINNFNTLMSVVGGITHSSVARLAKTY

DAG-PE BINDING DOMAIN = PROSITE DATABASE NO. PDOC00379

3   human     HNFQESNSLRPVACRHCKALILGIYKQGLKCRACGVNCHKQCKDRLSVEC
    nematode  HNFHETTFLTPTTCNHCNKLLWGILRQGFKCKDCGLAVHSCCKSNAVAEC

Figure 4 is a representation of an alignment of human and a partial (5' UTR and partial coding
sequence) murine mcg7 cDNA (GenBank Acc. Nos. W71787 and AA237373). The putative
initiation codon is underlined. The murine sequence represents a composite of 2 partial cDNA
sequences from the EST database (accession numbers W71787 and AA237373). Nucleotide
differences between human and murine sequences are shown in lower case lettering and identical
residues are indicated with asterisks.

Figure 5 is a representation of further 5' nucleotide and corresponding amino acid sequence for
human mcg7. Nucleotide positions 1-321 were derived from GenBank Acc. No. AC000134 and
nucleotides 322 onwards from Fig. 2(a). Two in-frame initiation codons are underlined.
Asterisks denote in-frame stop codons.

Figure 6 is a graphical representation of a GDP release assay. □ Experiment #1 (mean of
duplicates). 0 Experiment #2 (mean of duplicates). The exchange reaction contained 36pmols
of GST-MCG7 (N-terminally truncated; encoded by Construct B in Fig. 7) and 1.6-12.8 pmols
of recombinant GST-N-Ras.GDP. Reaction time 6 mins.
Estimated reaction constants:

Km = 2.1mM, Vmax = 37pMol/6min/36pMol [Expt#1]
Km = 1.5µM, Vmax = 30.3pMol/6 min/36pMol [Expt#2]

Figure 7 depicts various recombinant plasmids containing partial or full-length mcg7. 



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides an isolated nucleic acid molecule comprising a sequence of 
nucleotides encoding or complementary to a sequence encoding an amino acid sequence having 
homology to a guanine nucleotide exchange factor (GEF) or a derivative of said gene regulator.

More particularly, the present invention is directed to an isolated nucleic acid molecule 
comprising a sequence of nucleotides or a complementary form thereof selected from: 

(i) a nucleotide sequence set forth in SEQ ID NO:1;

(ii) a nucleotide sequence encoding an amino acid sequence set forth in SEQ ID NO:2;

(iii) a nucleotide sequence having at least about 40% similarity to the nucleotide sequence 
of (i) or (ii); and 

(iv) a nucleotide sequence capable of hybridizing under low stringency conditions to the 
nucleotide sequence set forth in (i), (ii) or (iii). 

Preferably, the percentage similarity is at least about 50%. More preferably, the percentage
similarity is at least about 60%.

Reference herein to a low stringency at 42°C includes and encompasses from at least about 1%
v/v to at least about 15% v/v formamide and from at least about 1M to at least about 2M salt for
hybridisation, and at least about 1M to at least about 2M salt for washing conditions. Alternative
stringency conditions may be applied where necessary, such as medium stringency, which
includes and encompasses from at least about 16% v/v to at least about 30% v/v formamide and
from at least about 0.5M to at least about 0.9M salt for hybridisation, and at least about 0.5M
to at least about 0.9M salt for washing conditions, or high stringency, which includes and
encompasses from at least about 31% v/v to at least about 50% v/v formamide and from at least
about 0.01M to at least about 0.15M salt for hybridisation, and at least about 0.01M to at least
about 0.15M salt for washing conditions.
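By way of illustration only, the formamide and salt ranges recited above may be summarised as a simple lookup. The following Python sketch is not part of the specification: the function name `classify_stringency`, the reading of "at least about" as closed numeric bounds, and the decision to return `None` for combinations outside the recited ranges are all assumptions made for the example.

```python
from typing import Optional

# (label, formamide % v/v range, salt molarity range) as recited in the text.
# Treating the bounds as exact closed intervals is an assumption of this sketch.
STRINGENCY_RANGES = [
    ("low", (1.0, 15.0), (1.0, 2.0)),
    ("medium", (16.0, 30.0), (0.5, 0.9)),
    ("high", (31.0, 50.0), (0.01, 0.15)),
]


def classify_stringency(formamide_pct: float, salt_molar: float) -> Optional[str]:
    """Return the stringency label whose ranges contain both values, else None."""
    for label, (f_lo, f_hi), (s_lo, s_hi) in STRINGENCY_RANGES:
        if f_lo <= formamide_pct <= f_hi and s_lo <= salt_molar <= s_hi:
            return label
    # Assumption: combinations outside the recited ranges are left unclassified.
    return None
```

For example, 10% v/v formamide with 1.5M salt falls within the low stringency ranges, whereas 40% v/v formamide with 0.1M salt falls within the high stringency ranges.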

The term "similarity" as used herein includes exact identity between compared sequences at the
nucleotide or amino acid level. Where there is non-identity at the nucleotide level, "similarity"
includes differences between sequences which result in different amino acids that are nevertheless
related to each other at the structural, functional, biochemical and/or conformational levels.
Where there is non-identity at the amino acid level, "similarity" includes amino acids that are
nevertheless related to each other at the structural, functional, biochemical and/or conformational
levels.

The present invention extends to nucleic acid molecules with percentage similarities of 
approximately 65%, 70%, 75%, 80%, 85%, 90% or 95% or above or a percentage in between. 
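By way of illustration only, the identity component of percentage similarity may be computed for two pre-aligned sequences as sketched below. The function name `percent_identity` is hypothetical, and the pairing of the human and nematode EF-hand segments is taken from the Figure 3 listing as an assumption. Because "similarity" as defined herein also embraces structurally or functionally related residues, percentage identity is only a lower bound on percentage similarity.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue.

    Both sequences must already be aligned to equal length; gap
    characters ('-') never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


# EF-hand segments as listed for Figure 3 (pairing assumed for illustration).
human_ef_hand = "DVDGDGHISQEEF"
nematode_ef_hand = "DHDRDGFISQEEF"
identity = percent_identity(human_ef_hand, nematode_ef_hand)
```

In this example 10 of the 13 aligned positions are identical, so the two segments share approximately 77% identity.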

The nucleic acid molecule of the present invention is hereinafter referred to as constituting the
"mcg7" gene. The protein encoded by mcg7 is referred to herein as "MCG7" and is involved in
signal transduction.

The present invention extends to the naturally occurring genomic mcg7 nucleotide sequence or
corresponding cDNA sequence or to derivatives thereof. Derivatives contemplated in the
present invention include fragments, parts, portions, mutants, homologues and analogues of
MCG7 or the corresponding genetic sequence. Derivatives also include single or multiple amino
acid substitutions, deletions and/or additions to MCG7 or single or multiple nucleotide
substitutions, deletions and/or additions to mcg7. Derivatives also include modifications to
nucleotide bases or amino acid residues to, for example, alter glycosylation sites or amino acid
side chains. "Additions" to the amino acid or nucleotide sequences include fusions with other
peptides, polypeptides or proteins or fusions to nucleotide sequences. Reference herein to
"MCG7" or "mcg7" includes references to all derivatives thereof including functional derivatives
and immunologically interactive derivatives of MCG7.

The mcg7 of the present invention is particularly exemplified herein from humans and in
particular from human chromosome 11q13.

The present invention also extends, however, to a range of homologues from, for example,
primates, livestock animals (eg. sheep, cows, horses, donkeys, pigs), companion animals (eg.
dogs, cats), laboratory test animals (eg. rabbits, mice, rats, guinea pigs), birds (eg. chickens,
ducks, geese, parrots), insects, nematodes, eukaryotic microorganisms and captive wild animals
(eg. deer, foxes, kangaroos). Reference herein to mcg7 or MCG7 includes reference to these
molecules of human origin as well as novel forms of non-human origin.

The nucleic acid molecules of the present invention may be DNA or RNA. When the nucleic
acid molecule is in DNA form, it may be genomic DNA or cDNA. RNA forms of the nucleic 
acid molecules of the present invention are generally mRNA. 

Although the nucleic acid molecules of the present invention are generally in isolated form, they
may be integrated into or ligated to or otherwise fused or associated with other genetic
molecules such as vector molecules and in particular expression vector molecules. Vectors and
expression vectors are generally capable of replication and, if applicable, expression in one or
both of a prokaryotic cell or a eukaryotic cell. Preferred prokaryotic cells include E. coli,
Bacillus sp. and Pseudomonas sp. Preferred eukaryotic cells include yeast, fungal, mammalian
and insect cells.

Accordingly, another aspect of the present invention contemplates a genetic construct comprising
a vector portion and an animal, more particularly a mammalian and even more particularly a
human mcg7 gene portion, which mcg7 gene portion is capable of encoding an MCG7 polypeptide
or a functional or immunologically interactive derivative thereof.

Preferably, the mcg7 gene portion of the genetic construct is operably linked to a promoter on
the vector such that said promoter is capable of directing expression of said mcg7 gene portion
in an appropriate cell.

In addition, the mcg7 gene portion of the genetic construct may comprise all or part of the gene
fused to another genetic sequence such as a nucleotide sequence encoding
glutathione-S-transferase or part thereof.

The present invention extends to such genetic constructs and to prokaryotic or eukaryotic cells
comprising same.

It is proposed in accordance with the present invention that MCG7 is a GEF involved in signal 
transduction. Mutations in mcg7 or MCG7 may result in defective control of cell proliferation 
leading to the development of or a propensity to develop various types of cancer. 

A deletion or aberration in the mcg7 gene may also be important in the detection of cancer or 
a propensity to develop cancer. An aberration may be a homozygous mutation or a 
heterozygous mutation. The detection may occur at the foetal or post-natal level. Detection 
may also be at the germline or somatic cell level. Furthermore, a risk of developing cancer may
be determined by assaying for aberrations in the parents of a subject under investigation.

According to this aspect of the present invention, there is contemplated a method of detecting
a condition caused or facilitated by an aberration in mcg7, said method comprising determining
the presence of a single or multiple nucleotide substitution, deletion and/or addition or other
aberration to one or both alleles of said mcg7 wherein the presence of such a nucleotide
substitution, deletion and/or addition or other aberration may be indicative of said condition or
a propensity to develop said condition.

The nucleotide substitutions, additions or deletions may be detected by any convenient means
including nucleotide sequencing, restriction fragment length polymorphism (RFLP), polymerase
chain reaction (PCR), oligonucleotide hybridization and single stranded conformation
polymorphism analysis (SSCP) amongst many others. An aberration includes modification to
existing nucleotides such as to modify glycosylation signals amongst other effects.

In an alternative method, aberrations in the mcg7 gene are detected by screening for mutations
in MCG7.

A mutation in MCG7 may be a single or multiple amino acid substitution, addition and/or
deletion. The mutation in mcg7 may also result in either no translation product being produced
or a product in truncated form. A mutation may also be an altered glycosylation pattern or the
introduction of side chain modifications to amino acid residues.

According to this aspect of the present invention, there is provided a method of detecting a
condition caused or facilitated by an aberration in mcg7, said method comprising screening for
a single or multiple amino acid substitution, deletion and/or addition to MCG7 wherein the
presence of such a mutation is indicative of said condition or of a propensity to develop said
condition.

A particularly convenient means of detecting a mutation in MCG7 is by use of antibodies.

Accordingly, another aspect of the present invention is directed to antibodies to MCG7 and its
derivatives. Such antibodies may be monoclonal or polyclonal and may be selected from
naturally occurring antibodies to MCG7 or may be specifically raised to MCG7 or derivatives 
thereof. In the case of the latter, MCG7 or its derivatives may first need to be associated with 
a carrier molecule. The antibodies to MCG7 of the present invention are particularly useful as 
diagnostic agents. 

For example, antibodies to MCG7 and its derivatives can be used to screen for wild-type MCG7
or for mutated MCG7 molecules. The latter may occur, for example, during or prior to certain
cancer development. A differential binding assay is also particularly useful. Techniques for such
assays are well known in the art and include, for example, sandwich assays and ELISA.
Knowledge of normal MCG7 levels or the presence of wild-type MCG7 may be important for
diagnosis of certain cancers or a predisposition for development of cancers or for monitoring
certain therapeutic protocols.

As stated above, antibodies to MCG7 of the present invention may be monoclonal or polyclonal
or may be fragments of antibodies such as Fab fragments. Furthermore, the present invention
extends to recombinant and synthetic antibodies and to antibody hybrids. A "synthetic antibody"
is considered herein to include fragments and hybrids of antibodies.

For example, specific antibodies can be used to screen for wild-type MCG7 molecules or specific
mutant molecules such as molecules having a certain deletion. This would be important, for
example, as a means for screening for levels of MCG7 in a cell extract or other biological fluid
or purifying MCG7 made by recombinant means from culture supernatant fluid or purified from
a cell extract. Techniques for the assays contemplated herein are known in the art and include,
for example, sandwich assays and ELISA.

It is within the scope of this invention to include any second antibodies (monoclonal, polyclonal
or fragments of antibodies or synthetic antibodies) directed to the first mentioned antibodies
discussed above. Both the first and second antibodies may be used in detection assays or a first
antibody may be used with a commercially available anti-immunoglobulin antibody. An antibody
as contemplated herein includes any antibody specific to any region of wild-type MCG7 or to a
specific mutant phenotype or to a deleted or otherwise altered region.

Both polyclonal and monoclonal antibodies are obtainable by immunization of a suitable animal
or bird with MCG7 or its derivatives and either type is utilizable for immunoassays. The
methods of obtaining both types of sera are well known in the art. Polyclonal sera are less
preferred but are relatively easily prepared by injection of a suitable laboratory animal or bird
with an effective amount of MCG7 or antigenic parts thereof or derivatives thereof, collecting
serum from the animal or bird, and isolating specific sera by any of the known immunoadsorbent
techniques. Although antibodies produced by this method are utilizable in virtually any type of
immunoassay, they are generally less favoured because of the potential heterogeneity of the
product.

The use of monoclonal antibodies in an immunoassay is particularly preferred because of the 
ability to produce them in large quantities and the homogeneity of the product. The preparation 
of hybridoma cell lines for monoclonal antibody production derived by fusing an immortal cell 
line and lymphocytes sensitized against the immunogenic preparation can be done by techniques 
which are well known to those who are skilled in the art.

Another aspect of the present invention contemplates a method for detecting MCG7 or a
derivative thereof in a biological sample, said method comprising contacting said biological
sample with an antibody specific for MCG7 or its derivatives or homologues for a time and under
conditions sufficient for an antibody-MCG7 complex to form, and then detecting said complex.

Preferably, the biological sample is a cell extract from a human or other animal or a bird. 

The presence of MCG7 may be determined in a number of ways such as by Western blotting
and ELISA procedures. A wide range of immunoassay techniques are available as can be seen
by reference to US Patent Nos. 4,016,043, 4,424,279 and 4,018,653. These include both
single-site and two-site or "sandwich" assays of the non-competitive types, as well as traditional
competitive binding assays. These assays also include direct binding of a labelled antibody to a
target.

Sandwich assays are among the most useful and commonly used assays and are favoured for use
in the present invention. A number of variations of the sandwich assay technique exist, and all
are intended to be encompassed by the present invention. Briefly, in a typical forward assay, an
unlabelled antibody is immobilized on a solid substrate and the sample to be tested brought into
contact with the bound molecule. After a suitable period of incubation, for a period of time
sufficient to allow formation of an antibody-antigen complex, a second antibody specific to the
antigen, labelled with a reporter molecule capable of producing a detectable signal, is then added
and incubated, allowing time sufficient for the formation of another complex of
antibody-antigen-labelled antibody. Any unreacted material is washed away, and the presence of
the antigen is determined by observation of a signal produced by the reporter molecule. The
results may either be qualitative, by simple observation of the visible signal, or may be
quantitated by comparing with a control sample containing known amounts of hapten. Variations on
the forward assay include a simultaneous assay, in which both sample and labelled antibody are
added simultaneously to the bound antibody. These techniques are well known to those skilled in
the art, including any minor variations as will be readily apparent. In accordance with the present
invention the sample is one which might contain MCG7 including cell extract or tissue biopsy.
The sample is, therefore, generally a biological sample comprising biological fluid but also
extends to fermentation fluid and supernatant fluid such as from a cell culture.

In the typical forward sandwich assay, a first antibody having specificity for the MCG7 or an
antigenic part thereof or a derivative thereof or antigenic parts thereof, is either covalently or
passively bound to a solid surface. The solid surface is typically glass or a polymer, the most
commonly used polymers being cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride
or polypropylene. The solid supports may be in the form of tubes, beads, discs or microplates,
or any other surface suitable for conducting an immunoassay. The binding processes are well
known in the art and generally consist of cross-linking, covalently binding or physically adsorbing.
The polymer-antibody complex is washed in preparation for the test sample. An aliquot of the
sample to be tested is then added to the solid phase complex and incubated for a period of time
sufficient (e.g. 2-40 minutes) and under suitable conditions (e.g. 25°C) to allow binding of any
subunit present in the antibody. Following the incubation period, the antibody subunit solid
phase is washed and dried and incubated with a second antibody specific for a portion of the
hapten. The second antibody is linked to a reporter molecule which is used to indicate the
binding of the second antibody to the hapten.





An alternative method involves immobilizing the target molecules in the biological sample and 
then exposing the immobilized target to specific antibody which may or may not be labelled with 
a reporter molecule. Depending on the amount of target and the strength of the reporter 
molecule signal, a bound target may be detectable by direct labelling with the antibody.
Alternatively, a second labelled antibody, specific to the first antibody is exposed to the target- 
first antibody complex to form a target-first antibody-second antibody tertiary complex. The 
complex is detected by the signal emitted by the reporter molecule. 

By "reporter molecule" as used in the present specification, is meant a molecule which, by its
chemical nature, provides an analytically identifiable signal which allows the detection of
antigen-bound antibody. Detection may be either qualitative or quantitative. The most commonly
used reporter molecules in this type of assay are either enzymes, fluorophores or radionuclide
containing molecules (i.e. radioisotopes) and chemiluminescent molecules.

In the case of an enzyme immunoassay, an enzyme is conjugated to the second antibody,
generally by means of glutaraldehyde or periodate. As will be readily recognized, however, a
wide variety of different conjugation techniques exist, which are readily available to the skilled
artisan. Commonly used enzymes include horseradish peroxidase, glucose oxidase,
beta-galactosidase and alkaline phosphatase, amongst others. The substrates to be used with the
specific enzymes are generally chosen for the production, upon hydrolysis by the corresponding
enzyme, of a detectable colour change. Examples of suitable enzymes include alkaline
phosphatase and peroxidase. It is also possible to employ fluorogenic substrates, which yield a
fluorescent product rather than the chromogenic substrates noted above. In all cases, the
enzyme-labelled antibody is added to the first antibody hapten complex, allowed to bind, and
then the excess reagent is washed away. A solution containing the appropriate substrate is then
added to the complex of antibody-antigen-antibody. The substrate will react with the enzyme
linked to the second antibody, giving a qualitative visual signal, which may be further quantitated,
usually spectrophotometrically, to give an indication of the amount of hapten which was present
in the sample. "Reporter molecule" also extends to use of cell agglutination or inhibition of
agglutination such as red blood cells on latex beads, and the like.

Alternatively, fluorescent compounds, such as fluorescein and rhodamine, may be chemically
coupled to antibodies without altering their binding capacity. When activated by illumination
with light of a particular wavelength, the fluorochrome-labelled antibody absorbs the light
energy, inducing a state of excitability in the molecule, followed by emission of the light at a
characteristic colour visually detectable with a light microscope. As in the EIA, the fluorescent
labelled antibody is allowed to bind to the first antibody-hapten complex. After washing off the
unbound reagent, the remaining tertiary complex is then exposed to light of the appropriate
wavelength; the fluorescence observed indicates the presence of the hapten of interest.
Immunofluorescence and EIA techniques are both very well established in the art and are
particularly preferred for the present method. However, other reporter molecules, such as
radioisotope, chemiluminescent or bioluminescent molecules, may also be employed.

As stated above, the present invention extends to genetic constructs capable of encoding MCG7
or functional derivatives thereof. Such genetic constructs are also contemplated to be useful in
modulating expression of specific genes in which mcg7 is involved in tissue-specific or temporal
regulation.

Accordingly, another aspect of the present invention is directed to a genetic construct comprising
a nucleotide sequence encoding a peptide, polypeptide or protein and mcg7 or a functional
derivative or homologue thereof capable of modulating the expression of said nucleotide
sequence.

The present invention is further described with reference to the following non-limiting Examples. 

EXAMPLE 1

A human gene (designated mcg7) was identified and isolated from chromosome 11q13 which
encodes a protein that bears striking homology with guanine nucleotide exchange factors (GEFs)
from a wide variety of organisms (Fig. 1).

EXAMPLE 2 

The composite mcg7 cDNA sequence is at least 2.4kb in length and Figure 2(a) shows a
predicted translation product of at least 609 amino acids beginning at methionine 120. An
alternative start site due to alternate exon splicing (indicated in lower case) may yield a protein
of 671 amino acids starting at methionine 58 (Fig. 2a).

EXAMPLE 3 

An mcg7 homologue from C. elegans has been identified, the product of which is highly
conserved with that of MCG7 (Fig. 3). There are several salient features of the protein which
have been underlined in Fig. 3, namely: a guanine nucleotide binding region, a diacylglycerol
binding region, and "EF-hand" calcium binding regions. In addition, there are several potential
cAMP, protein kinase C, and casein kinase II phosphorylation sites, as well as a number of
potential sites for glycosylation (not indicated).

EXAMPLE 4 

A number of partial human and murine EST clones exist for mcg7. The GenBank database
contains a cDNA (Acc. no. Y12336) encoding a full-length open reading frame (ORF) for human
mcg7 as well as a partial murine mcg7 ORF (Y12339). In addition, the complete genomic
sequence of the human mcg7 gene is contained within GenBank entry AC000134.



EXAMPLE 5

The best characterised GEFs are members of the family of ras oncoproteins, which play a pivotal
role in signal transduction and when mutated are responsible for tumour development. A variety
of therapeutic regimes for cancer treatment have been designed to specifically interfere with the
ras signalling pathways. There is potential, therefore, that the product of mcg7 could also be a
target for such clinical strategies.

EXAMPLE 6 

The nucleotide sequence for mcg7 cDNA was extended 5' with genomic DNA sequence from 
Genbank accession number AC000134 (positions 1-321) and analysed for additional coding 
sequence 5' to the putative initiation codon (nt 68 1-683) (Fig. 5). An additional in-frame ATG 
occurs at position nt 495-497 when the alternatively splice exon (position nt 504-609) is present 
15 (also shown in Fig. 2(a)). This closely matches the Kozak consensus. When this exon is absent, 
then the ATG is not in- frame and other possible initiation codons are absent (resulting translation 
shown in lower case lettering) (also shown in Fig. 2(b)). Further evidence that the initiation 
codon at position nt 681-683 is the true initiation site is given in Figure 4. 
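
The reading-frame argument can be checked from the stated coordinates: two ATG codons share a frame when their first bases are a multiple of three apart, and removing the exon changes that distance by the exon length. A minimal sketch, assuming the positions quoted above all refer to the same extended sequence (the helper name is illustrative):

```python
def in_frame(upstream_atg: int, downstream_atg: int, removed_nt: int = 0) -> bool:
    """True when the two codon starts are a whole number of codons apart.

    `removed_nt` is the length of any segment lying between them that is
    spliced out (0 when the exon is retained).
    """
    return (downstream_atg - upstream_atg - removed_nt) % 3 == 0


UPSTREAM_ATG = 495            # first base of the additional ATG (nt 495-497)
INITIATION_ATG = 681          # first base of the putative initiation codon (nt 681-683)
EXON_LENGTH = 609 - 504 + 1   # alternatively spliced exon, nt 504-609

exon_present = in_frame(UPSTREAM_ATG, INITIATION_ATG)
exon_absent = in_frame(UPSTREAM_ATG, INITIATION_ATG, removed_nt=EXON_LENGTH)
extra_codons = (INITIATION_ATG - UPSTREAM_ATG) // 3

if __name__ == "__main__":
    print(EXON_LENGTH, exon_present, exon_absent, extra_codons)
```

The exon length is not a multiple of three, which is why its removal shifts the upstream ATG out of frame; the number of extra codons gained with the exon present equals the difference between the 671- and 609-residue products of Example 2.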

Alignment of human and partial murine mcg7 cDNA sequences is shown in Figure 4. The 
putative initiation codon is at position nt 360-362. Both murine ESTs appear to have an 
upstream in-frame stop codon at position nt 326-328, downstream of the differentially spliced 
exon, and the sequence alignment thus suggests that this region represents the 5' UTR of mcg7. 

Furthermore, similarity with the C. elegans homologue (Fig. 13) strongly suggests that the ATG 
codon at position nt 360-362 encodes the N-terminus of MCG7. 

EXAMPLE 7 

Figure 6 shows data from experiments indicating that a truncated version of MCG7 when 
expressed as a GST fusion protein (construct B in Fig. 7) can function as a Ras-guanine 
nucleotide exchange factor. In brief, Ras (unprocessed and as a GST fusion protein) is loaded 
with [³H]-GDP then incubated in the presence of excess cold GTP ± GST-MCG7. Full details of 
this assay can be found in Porfiri et al., J. Biol. Chem. 269: 22672-22677 (1994). 

EXAMPLE 8 

Nucleotide sequence data generated from cosmid clone cSRL-20h12 with the T7 primer 
(Promega, and Applied Biosystems Incorporated dye terminator sequencing kit) was aligned to 
the GenBank Expressed Sequence Tag (EST) database using the program BLASTN (Altschul 
et al., 1990) and was found to match GenBank entries T78563 (clone 113434), T09103 (clone 
HIBBP12) and AA035643 (clone 471819). EST clones 113434 and 471819 were obtained from 
Genome Systems Inc. and these DNAs were sequenced on both strands with gene-specific 
primers (Table 1) to generate the cDNA sequence of mcg7 shown in Figures 2(a) and (b). 

The cDNA sequence of mcg7 was translated in all possible reading frames and compared to the 
GenBank non-redundant protein database using the program BLASTX (Altschul et al., 1990) and 
the coding region was assigned on the basis of showing homology to the C. elegans protein 
F25B3.3 (Figure 3). The mcg7 cDNA composite was suspected to contain a single nucleotide 
error that originated from clone 471819 and the correct nucleotide sequence was, therefore, 
sought by reverse transcription-polymerase chain reaction (RT-PCR) of the cDNA fragment 
from a human cDNA pool. Total RNA was extracted from a human lymphoblastoid cell line 
using an RNeasy Mini Kit (Qiagen). cDNA synthesis was conducted with the reverse 
transcriptase Superscript II RNaseH- (GIBCO, BRL) and random hexamers using the procedure 
recommended by the manufacturer (GIBCO, BRL). One fortieth of the cDNA mix was 
subjected to 35 cycles of PCR using the following cycling conditions: 94°C for 30 seconds, 58°C 
for 30 seconds and 72°C for 90 seconds. The 50µl reaction mix consisted of 1x reaction buffer 
(Dade Scientific), 2mM dNTP mix, 20pmol of primers (see Table 1) MCG7UF (within the 
variably spliced exon of Figure 2(b), between nucleotide positions 184-201) and SGCADRV2 
(between nucleotide positions 866-846 of Figure 2(a)) and 10 units of Dynazyme (Dade 
Scientific). The resulting PCR product was cloned into the pGEM-T vector (Promega) using 
standard methodology and sequenced using gene-specific primers. The correct nucleotide 
sequence of mcg7 (as shown in Figure 2(a)) matches that of the recently released GenBank entry 
Y12336. A partial mouse mcg7 cDNA sequence can also be found in GenBank entry Y12339. 
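
The stated position of the reverse primer can be cross-checked against the sequence listing: SGCADRV2 is given 5' to 3' on the reverse strand, so its reverse complement should appear on the forward strand at nucleotides 846-866. A minimal sketch, assuming the excerpt below (nucleotides 816-866 of SEQ ID NO:1, copied from the listing) is transcribed correctly; the helper names are illustrative:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


SGCADRV2 = "GGAGCGATACTCCAAGTAGGT"  # SEQ ID NO:22, 5' to 3'

# Excerpt of SEQ ID NO:1 (forward strand), 1-based positions 816-866.
EXCERPT_START = 816
EXCERPT = "CTGGAGCCCATGGAGCTGGCGGAGCATCTCACCTACTTGGAGTATCGCTCC"


def locate_reverse_primer(primer: str, template: str, template_start: int):
    """Return (first, last) 1-based template positions covered by a reverse primer, or None."""
    idx = template.find(reverse_complement(primer))
    if idx < 0:
        return None
    first = template_start + idx
    return first, first + len(primer) - 1


if __name__ == "__main__":
    print(reverse_complement(SGCADRV2))
    print(locate_reverse_primer(SGCADRV2, EXCERPT, EXCERPT_START))
```

A forward primer would be located the same way without taking the reverse complement.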

EXAMPLE 9 

The coding sequence of mcg7 was cloned into vectors for expression in both bacterial and 
mammalian cells. In addition to the full-length constructs, the deletion constructs shown in 
Figure 7 were designed to retain the guanine nucleotide exchange (GEF) domain. For 
prokaryotic expression, the mcg7 coding region was inserted downstream of and in-frame with 
the Sj26 cassette of the pGEX (Pharmacia) series of vectors (Smith and Johnson, 1988) using 
standard cloning techniques (Sambrook et al., 1989). For mammalian expression, the mcg7 
coding sequence was first myc-tagged at the N-terminus and then ligated into the expression 
vector pcExv-n using standard cloning techniques. Ligation junctions of the constructs were 
sequenced as the cloning strategies inadvertently changed or introduced additional amino acids 
as shown below. 

Construct (A): EST clone 113434 was digested with ApaI (Figure 2(a), nucleotide positions 
1022 to >2416 (within the vector)), blunt-ended with T4 DNA polymerase according to the 
specifications of the manufacturer (New England Biolabs) and ligated into the SmaI site of 
pGEX-3X. 

Sequence of the pGEX and mcg7 (underlined) junction: 

pGEX-3X               mcg7 (1022) 
Sj26 ... GGG ATC CC   C CTG GTC   [SEQ ID NO:5] 
additional amino acids: Gly Ile Pro 

Construct (B): EST clone 113434 was digested with EcoRI (Figure 2(a), nucleotide positions 
<695 (within the vector) to 1711) and ligated into the EcoRI site of pGEX-1. 

Sequence of the pGEX and mcg7 (underlined) junction: 

pGEX-1                        mcg7 (695) 
Sj26 ... GAA TTC GGC ACG AG   C CGA CGG   [SEQ ID NO:6] 
additional amino acids: Glu Phe Gly Thr Ser 

Construct (C): full-length mcg7: The pGEM-T clone containing the 5' end of the mcg7 coding 
region was digested with (subsequently blunt-ended with T4 DNA polymerase) and BstXI 
to liberate the fragment between nucleotide positions 336 and 830 of Figure 2(a). Clone 113434 
was digested with BstXI and HindIII (vector derived) to liberate a fragment between nucleotide 
positions 830 and >2416 (vector derived) of Figure 2(a). A pGEM-11Zf vector (Promega) 
containing the myc-tag (constructed by J. Hancock) was digested with ApaI (subsequently 
blunt-ended with T4 DNA polymerase) and HindIII, and ligated with the 2 inserts described above. 

Sequence of the myc-tag/mcg7 junction: 

myc-tag vector BamHI mcg7 5' UTR (337) start 

ATGGAGCAGAAGCTGATCTCCGAGGAGGACCTG CCCGGGGCAGCTggatccG CAGCCCACCCCGGGCC GGCGGCCATG 
MEQKLISEEDL PGAAGS AAHPAPAAM 

additional amino acids 

The myc-tagged full-length mcg7 insert in pGEM-11Zf [SEQ ID NO:7] was then excised with 
SacI and HindIII (both vector derived) and directionally cloned into the mammalian expression 
vector pEXV (Beranger et al., 1994). 

Construct (D): Construct (C) in pGEM-11Zf was sequentially digested with HindIII (this site 
was subsequently blunt-ended with T4 DNA polymerase) then BamHI, and ligated into pGEX-2T 
digested with BamHI and SmaI. Digestion with BamHI removed the myc-tag of Construct (C). 



Sequence of the pGEX and mcg7 [SEQ ID NO:9] (underlined) junction: 

pGEX-2T BamHI mcg7 (337) 

Sj26 ... gga tcc G CA GCC CAC CCC GGG CCG GCG GCC ATG 
Gly Ser Ala Ala His Pro Ala Pro Ala Ala Met 
additional amino acids 



EXAMPLE 10 

Overnight bacterial cultures containing the pGEX plasmid were used to inoculate 500ml of Luria 
Broth media containing 50µg/ml ampicillin. The cultures were grown to an OD of ~0.8 and then 
induced with 1mM of IPTG for up to 3 hours at 37°C. The bacteria were pelleted and 
resuspended in 15 ml of STE buffer (10mM Tris pH 8.0, 150 mM NaCl and 1mM EDTA) with 
1 mg/ml lysozyme. The mixture was left on ice for more than 1 hour and subsequent steps were 
performed at 4°C. Protease inhibitors aprotinin, pepstatin and leupeptin were added at final 
concentrations of 25µg/ml prior to the addition of Triton-X-100 (2% v/v final) and n-lauroyl 
sarcosine (1.5% final). The lysate was sonicated for ~1 minute and pelleted at 14,000 x g for 15 
minutes. 100 µl of 50% w/v glutathione-sephadex bead slurry (in PBS) was added per ml of 
supernatant. Following a 30 minute incubation at 4°C, the beads were washed three times with 
NETN (20mM Tris-HCl pH 8.0, 100mM NaCl, 1mM EDTA, 0.5% NP40), once with NETN-HS 
(equivalent to NETN but with 1M NaCl), and once in NETN. The bound protein was 
directly analysed by SDS-polyacrylamide gel electrophoresis (PAGE) as described below or the 
bound protein was eluted from the beads with the following elution buffer (50mM Tris pH 8.0, 
150mM NaCl, 5mM MgCl2, 1mM DTT, 10mM reduced glutathione) for use in GDP release 
assays. 

EXAMPLE 11 

Twenty microlitres of GST-sepharose-bound MCG7 were added to an equal volume of 2 x 
sample loading dye (100mM Tris pH6.8, 2% v/v mercaptoethanol, 4% w/v SDS, 0.2% w/v 
bromophenol blue, 20% v/v glycerol), boiled for 5 min and loaded onto a 7.5% w/v SDS-PAGE 
gel (Sambrook et al., 1989). The Coomassie brilliant blue stained gel (Sambrook et al., 1989) 
typically displayed a protein doublet, running between 87-95 kDa consisting of the MCG7-GST 
fusion and a slightly smaller, co-purified contaminating E. coli protein of ~105kDa. The 
calculated molecular weight of full-length MCG7 is 77.5 kDa (Construct (D)) and the GST 
component has a molecular weight of 26kDa hence the recombinant protein runs slightly smaller 
than predicted. A Western blot of the same gel probed with anti-GST antibody yields an 
MCG7-specific band at the same position as that of the stained gel. 
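
The comparison between predicted and observed size rests on simple addition of the two stated component masses. A minimal sketch, assuming the fusion mass is just the sum of the MCG7 and GST values quoted above (the names are illustrative):

```python
MCG7_KDA = 77.5   # calculated mass of full-length MCG7 (Construct (D))
GST_KDA = 26.0    # mass of the GST component
OBSERVED_RANGE_KDA = (87.0, 95.0)  # apparent size of the doublet on the gel


def predicted_fusion_kda(protein_kda: float, tag_kda: float) -> float:
    """Predicted mass of a fusion protein as the sum of its two parts."""
    return protein_kda + tag_kda


predicted = predicted_fusion_kda(MCG7_KDA, GST_KDA)
runs_smaller = predicted > OBSERVED_RANGE_KDA[1]

if __name__ == "__main__":
    print(predicted, runs_smaller)
```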

EXAMPLE 12 

Assumptions: (a) GST-Ras molecular weight = 50 kD; (b) Concentration of GST-Ras solution 
= 1mg/ml = 20µM; (c) [³H]-GDP is 1mCi/ml and 13.3Ci/mmol, therefore [³H]-GDP 
concentration = 75µM and 1 pmol [³H]-GDP = 15,466 cpm; (d) Elution buffer = Buffer E = 20 
mM Tris-Cl, pH7.5; 50mM NaCl; 5mM MgCl2; 1mM DTT (added just before use). Buffer E 
+ BSA = Buffer E + 1mg/ml BSA (added just before use). 

Mix together, in the following order and mix well after each addition: 

10µl (=10µg) GST-Ras (@1mg/ml in Buffer E), 463µl Buffer E + BSA, 7µl [³H]-GDP, 10µl 
490 mM EDTA. Incubate @ RT for 10 min. Add 10µl 0.5 M MgCl2 and mix well. Incubate 
@ RT for 10 min. Place on ice. During the first incubation the excess EDTA concentration is 
5mM, during the second incubation the excess Mg concentration is 5mM. The [³H]-GDP 
concentration is approximately 1µM and the final concentration of GST-Ras is 400nM. Thus 
20µl of the final mix will contain 8pmol of GST-Ras protein. Specific activity of GDP is 
15,466 cpm/pmol x (1/1.4) = 11,047 cpm/pmol. 
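
The concentrations in this example can be reproduced from the stated stock values and pipetted volumes. The sketch below is only a worked check under those stated assumptions (50 kD GST-Ras at 1 mg/ml, label at 1 mCi/ml and 13.3 Ci/mmol, a 500 µl final mix); the function and variable names are illustrative:

```python
def mg_per_ml_to_uM(mg_per_ml: float, mw_da: float) -> float:
    """Convert a protein concentration in mg/ml to micromolar."""
    return mg_per_ml / mw_da * 1e6


def label_stock_uM(mCi_per_ml: float, Ci_per_mmol: float) -> float:
    """Molar concentration (micromolar) of a radiolabelled stock."""
    return mCi_per_ml / Ci_per_mmol * 1000.0


ras_stock_uM = mg_per_ml_to_uM(1.0, 50_000)
gdp_stock_uM = label_stock_uM(1.0, 13.3)

# Volumes (microlitres) added in the order given in the protocol.
volumes_ul = {"GST-Ras": 10, "Buffer E + BSA": 463, "[3H]-GDP": 7, "EDTA": 10, "MgCl2": 10}
total_ul = sum(volumes_ul.values())

ras_final_nM = ras_stock_uM * volumes_ul["GST-Ras"] / total_ul * 1000.0
ras_pmol_in_20ul = ras_final_nM * 20 / 1000.0   # nM x microlitres / 1000 = pmol
gdp_final_uM = gdp_stock_uM * volumes_ul["[3H]-GDP"] / total_ul
specific_activity_cpm_per_pmol = 15_466 / 1.4   # correction factor as stated in the text

if __name__ == "__main__":
    print(ras_stock_uM, gdp_stock_uM, total_ul)
    print(ras_final_nM, ras_pmol_in_20ul, gdp_final_uM, specific_activity_cpm_per_pmol)
```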


EXAMPLE 13 

Exchange Ras with labelled GDP as above. Add unlabelled GTP (stock = 100mM, pH7) to 1 
mM. Adjust Mg concentration by adding 5µl 0.5M EDTA to labelled Ras, 5µl 0.5M EDTA to 
500µl MCG7, and 5µl 0.5M EDTA to 500µl Buffer E + BSA. On ice set up microfuge tubes 
with 40µl Ras-GDP (in triplicate) with 40µl MCG7 or Buffer E + BSA (control). Transfer tubes 
to heat block @ 25°C and incubate for 10, 20 or 30 min. Stop exchange reactions with 1ml of 
ice cold Buffer E and place on ice. Pre-soak nitrocellulose filters, pore size 0.45µm, in Buffer E. 
Assemble the vacuum manifold apparatus (Millipore) with wet filters and plug the wells with 
rubber bungs. Switch on the vacuum pump. Remove the first plug, aliquot the sample and once 
it has been sucked through, wash the filter with 10ml of ice cold Buffer E. Remove next plug 
etc and continue round the manifold. Take manifold apart. Pin the filters to a pin board reserved 
for [³H]. Air dry. Take up in 4ml scintillation fluid and count. These studies have been carried 
out with a truncated MCG7-GST fusion protein (amino acids 341 of Figure 2a to stop encoded 
within construct B). 

Those skilled in the art will appreciate that the invention described herein is susceptible to 
variations and modifications other than those specifically described. It is to be understood that 
the invention includes all such variations and modifications. The invention also includes all of 
the steps, features, compositions and compounds referred to or indicated in this specification, 
individually or collectively, and any and all combinations of any two or more of said steps or 
features. 

TABLE 1 
mcg7-specific oligonucleotides 

name             sequence (5' to 3')            SEQ ID NOs. 

M1044R           GGA CAA AGT GTG TGA TGA ACC    SEQ ID NO:11 
MCG7-GEF-REV2    CTC ATC CTC CGT CTG ATA CTG    SEQ ID NO:12 
M7R              GTA GAT GTG GAT CAG CTT GG     SEQ ID NO:13 
MCG7 CA FOR      AGG TGG AGA ATG GTC AAG G      SEQ ID NO:14 
MCG7-GEF-REV     GTC ATA GTC TGT CTC CTA CT     SEQ ID NO:15 
MCG7 GEF FOR     ACA TAG ACA GCG TGC CTA CC     SEQ ID NO:16 
MCG7-PKC-REV     TAC AAC CTT AGG GAC ACC AG     SEQ ID NO:17 
MCG7-PKC-FOR     TGC TGA GCC TGC TCA CGG TG     SEQ ID NO:18 
T09103F          CAA GTG AAC AGC ACG TCC        SEQ ID NO:19 
M7F              GAC TAT CTC AAG GAC CAG CTG    SEQ ID NO:20 
MCG7UF           GGT TCG GTC CGA GCC CGG        SEQ ID NO:21 
SGCADRV2         GGA GCG ATA CTC CAA GTA GGT    SEQ ID NO:22 



BIBLIOGRAPHY 

1. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) J. Mol. 
Biol. 215: 403-410. 

2. Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular Cloning: A 
Laboratory Manual. 

3. Smith, D.B., and Johnson, K.S. (1988) Gene 67: 31-40. 

4. Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) Nucleic Acids Res. 22: 
4673-4680. 

5. Beranger, F., Paterson, H., Powers, S., de Gunzburg, J. and Hancock, J.F. (1994) 
Molecular and Cellular Biology 14: 744-758. 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: The Council of The Queensland Institute for Medical Research 

(ii) TITLE OF INVENTION: A NOVEL GENE AND USES THEREFOR 

(iii) NUMBER OF SEQUENCES: 22 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: DAVIES COLLISON CAVE 

(B) STREET: 1 LITTLE COLLINS STREET 

(C) CITY: MELBOURNE 

(D) STATE: VICTORIA 

(E) COUNTRY: AUSTRALIA 

(F) ZIP: 3000 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: AUSTRALIAN PROVISIONAL 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: HUGHES, DR E JOHN L 
(C) REFERENCE/DOCKET NUMBER: EJH/AF 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +61 3 9254 2777 

(B) TELEFAX: +61 3 9254 2770 

(C) TELEX: AA 31787 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..2188 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CG ATT TCA TTC CTC GCT CCC CAC AGG TCC CTC TCC CCA AAA TAT TCC 47 
Ile Ser Phe Leu Ala Pro His Arg Ser Leu Ser Pro Lys Tyr Ser 
1 5 10 15 

CAT CTT GTC CTA GCC CAT CCC CCA GAC TAT CTC AAG GAC CAG CTG TCC 95 
His Leu Val Leu Ala His Pro Pro Asp Tyr Leu Lys Asp Gln Leu Ser 
20 25 30 

CCA CGC CCC CGA CCT CCA CTA GGC CTG TGC CAC CCG CTG CCT GCA GGA 143 
Pro Arg Pro Arg Pro Pro Leu Gly Leu Cys His Pro Leu Pro Ala Gly 
35 40 45 

AGA CGC CCG GTC CCG GGC CGG GTT AGC CCC ATG GGA ACG CAG CGC CTG 191 
Arg Arg Pro Val Pro Gly Arg Val Ser Pro Met Gly Thr Gln Arg Leu 
50 55 60 

TGT GGC CGC GGG ACT CAA GGC TGG CCT GGC TCA AGT GAA CAG CAC GTC 239 
Cys Gly Arg Gly Thr Gln Gly Trp Pro Gly Ser Ser Glu Gln His Val 
65 70 75 

CAG GAG GCG ACC TCG TCC GCG GGT TTG CAT TCT GGG GTG GAC GAG CTG 287 
Gln Glu Ala Thr Ser Ser Ala Gly Leu His Ser Gly Val Asp Glu Leu 
80 85 90 95 

GGG GTT CGG TCC GAG CCC GGT GGG AGG CTC CCG GAG CGC AGC CTG GGC 335 
Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser Leu Gly 
100 105 110 

CCA GCC CAC CCC GCG CCG GCG GCC ATG GCA GGC ACC CTG GAC CTG GAC 383 
Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp Leu Asp 
115 120 125 

AAG GGC TGC ACG GTG GAG GAG CTG CTC CGC GGG TGC ATC GAA GCC TTC 431 
Lys Gly Cys Thr Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe 
130 135 140 

GAT GAC TCC GGG AAG GTG CGG GAC CCG CAG CTG GTG CGC ATG TTC CTC 479 
Asp Asp Ser Gly Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu 
145 150 155 

ATG ATG CAC CCC TGG TAC ATC CCC TCC TCT CAG CTG GCG GCC AAG CTG 527 
Met Met His Pro Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu 
160 165 170 175 

CTC CAC ATC TAC CAA CAA TCC CGG AAG GAC AAC TCC AAT TCC CTG CAG 575 
Leu His Ile Tyr Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln 
180 185 190 

GTG AAA ACG TGC CAC CTG GTC AGG TAC TGG ATC TCC GCC TTC CCA GCG 623 
Val Lys Thr Cys His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala 
195 200 205 

GAG TTT GAC TTG AAC CCG GAG TTG GCT GAG CAG ATC AAG GAG CTG AAG 671 
Glu Phe Asp Leu Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys 
210 215 220 

GCT CTG CTA GAC CAA GAA GGG AAC CGA CGG CAC AGC AGC CTA ATC GAC 719 
Ala Leu Leu Asp Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp 
225 230 235 

ATA GAC AGC GTC CCT ACC TAC AAG TGG AAG CGG CAG GTG ACT CAG CGG 767 
Ile Asp Ser Val Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg 
240 245 250 255 

AAC CCT GTG GGA CAG AAA AAG CGC AAG ATG TCC CTG TTG TTT GAC CAC 815 
Asn Pro Val Gly Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His 
260 265 270 

CTG GAG CCC ATG GAG CTG GCG GAG CAT CTC ACC TAC TTG GAG TAT CGC 863 
Leu Glu Pro Met Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg 
275 280 285 

TCC TTC TGC AAG ATC CTG TTT CAG GAC TAT CAC AGT TTC GTG ACT CAT 911 
Ser Phe Cys Lys Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His 
290 295 300 

GGC TGC ACT GTG GAC AAC CCC GTC CTG GAG CGG TTC ATC TCC CTC TTC 959 
Gly Cys Thr Val Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe 
305 310 315 

AAC AGC GTC TCA CAG TGG GTG CAG CTC ATG ATC CTC AGC AAA CCC ACA 1007 
Asn Ser Val Ser Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr 
320 325 330 335 

GCC CCG CAG CGG GCC CTG GTC ATC ACA CAC TTT GTC CAC GTG GCG GAG 1055 
Ala Pro Gln Arg Ala Leu Val Ile Thr His Phe Val His Val Ala Glu 
340 345 350 

AAG CTG CTA CAG CTG CAG AAC TTC AAC ACG CTG ATG GCA GTG GTC GGG 1103 
Lys Leu Leu Gln Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly 
355 360 365 

GGC CTG AGC CAC AGC TCC ATC TCC CGC CTC AAG GAG ACC CAC AGC CAC 1151 
Gly Leu Ser His Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His 
370 375 380 

GTT AGC CCT GAG ACC ATC AAG CTC TGG GAG GGT CTC ACG GAA CTA GTG 1199 
Val Ser Pro Glu Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val 
385 390 395 

ACG GCG ACA GGC AAC TAT GGC AAC TAC CGG CGT CGG CTG GCA GCC TGT 1247 
Thr Ala Thr Gly Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys 
400 405 410 415 

GTG GGC TTC CGC TTC CCG ATC CTG GGT GTG CAC CTC AAG GAC CTG GTG 1295 
Val Gly Phe Arg Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val 
420 425 430 

GCC CTG CAG CTG GCA CTG CCT GAC TGG CTG GAC CCA GCC CGG ACC CGG 1343 
Ala Leu Gln Leu Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg 
435 440 445 

CTC AAC GGG GCC AAG ATG AAG CAG CTC TTT AGC ATC CTG GAG GAG CTG 1391 
Leu Asn Gly Ala Lys Met Lys Gln Leu Phe Ser Ile Leu Glu Glu Leu 
450 455 460 

GCC ATG GTG ACC AGC CTG CGG CCA CCA GTA CAG GCC AAC CCC GAC CTG 1439 
Ala Met Val Thr Ser Leu Arg Pro Pro Val Gln Ala Asn Pro Asp Leu 
465 470 475 

CTG AGC CTG CTC ACG GTG TCT CTG GAT CAG TAT CAG ACG GAG GAT GAG 1487 
Leu Ser Leu Leu Thr Val Ser Leu Asp Gln Tyr Gln Thr Glu Asp Glu 
480 485 490 495 

CTG TAC CAG CTG TCC CTG CAG CGG GAG CCG CGC TCC AAG TCC TCG CCA 1535 
Leu Tyr Gln Leu Ser Leu Gln Arg Glu Pro Arg Ser Lys Ser Ser Pro 
500 505 510 

ACC AGC CCC ACG AGT TGC ACC CCA CCA CCC CGG CCC CCG GTA CTG GAG 1583 
Thr Ser Pro Thr Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu 
515 520 525 

GAG TGG ACC TCG GCT GCC AAA CCC AAG CTG GAT CAG GCC CTC GTG GTG 1631 
Glu Trp Thr Ser Ala Ala Lys Pro Lys Leu Asp Gln Ala Leu Val Val 
530 535 540 

GAG CAC ATC GAG AAG ATG GTG GAG TCT GTG TTC CGG AAC TTT GAC GTC 1679 
Glu His Ile Glu Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val 
545 550 555 

GAT GGG GAT GGC CAC ATC TCA CAG GAA GAA TTC CAG ATC ATC CGT GGG 1727 
Asp Gly Asp Gly His Ile Ser Gln Glu Glu Phe Gln Ile Ile Arg Gly 
560 565 570 575 

AAC TTC CCT TAC CTC AGC GCC TTT GGG GAC CTC GAC CAG AAC CAG GAT 1775 
Asn Phe Pro Tyr Leu Ser Ala Phe Gly Asp Leu Asp Gln Asn Gln Asp 
580 585 590 

GGC TGC ATC AGC AGG GAG GAG ATG GTT TCC TAT TTC CTG CGC TCC AGC 1823 
Gly Cys Ile Ser Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser 
595 600 605 

TCT GTG TTG GGG GGG CGC ATG GGC TTC GTA CAC AAC TTC CAG GAG AGC 1871 
Ser Val Leu Gly Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser 
610 615 620 

AAC TCC TTG CGC CCC GTC GCC TGC CGC CAC TGC AAA GCC CTG ATC CTG 1919 
Asn Ser Leu Arg Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu 
625 630 635 

GGC ATC TAC AAG CAG GGC CTC AAA TGC CGA GCC TGT GGA GTG AAC TGC 1967 
Gly Ile Tyr Lys Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys 
640 645 650 655 

CAC AAG CAG TGC AAG GAT CGC CTG TCA GTT GAG TGT CGG CGC AGG GCC 2015 
His Lys Gln Cys Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala 
660 665 670 

CAG AGT GTG AGC CTG GAG GGG TCT GCA CCC TCA CCC TCA CCC ATG CAC 2063 
Gln Ser Val Ser Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His 
675 680 685 

AGC CAC CAT CAC CGC GCC TTC AGC TTC TCT CTG CCC CGC CCT GGC AGG 2111 
Ser His His His Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg 
690 695 700 

CGA GGC TCC AGG CCT CCA GAG ATC CGT GAG GAG GAG GTA CAG ACG GTG 2159 
Arg Gly Ser Arg Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val 
705 710 715 

GAG GAT GGG GTG TTT GAC ATC CAC TTG TA ATAGATGCTG TGGTTGGATC 2208 
Glu Asp Gly Val Phe Asp Ile His Leu 
720 725 

AAGGACTCAT TCCTGCCTTG GAGAAAATAC TTCAACCAGA GCAGGGAGCC TGGGGGTGTC 2268 

GGGGCAGGAG GCTGGGGATG GGGGTGGGAT ATGAGGGTGG CATGCAGCTG AGGGCAGGGC 2328 

CAGGGCTGGT GTCCCTAAGG TTGTACAGAC TCTTGTGAAT ATTTGTATTT TCCAGATGGA 2388 

ATAAAAAGGC CCGTGTAATT AACCTTC 2415 

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 728 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Ile Ser Phe Leu Ala Pro His Arg Ser Leu Ser Pro Lys Tyr Ser His 
1 5 10 15 

Leu Val Leu Ala His Pro Pro Asp Tyr Leu Lys Asp Gln Leu Ser Pro 
20 25 30 

Arg Pro Arg Pro Pro Leu Gly Leu Cys His Pro Leu Pro Ala Gly Arg 
35 40 45 

Arg Pro Val Pro Gly Arg Val Ser Pro Met Gly Thr Gln Arg Leu Cys 
50 55 60 

Gly Arg Gly Thr Gln Gly Trp Pro Gly Ser Ser Glu Gln His Val Gln 
65 70 75 80 

Glu Ala Thr Ser Ser Ala Gly Leu His Ser Gly Val Asp Glu Leu Gly 
85 90 95 

Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser Leu Gly Pro 
100 105 110 

Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp Leu Asp Lys 
115 120 125 

Gly Cys Thr Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe Asp 
130 135 140 

Asp Ser Gly Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu Met 
145 150 155 160 

Met His Pro Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu Leu 
165 170 175 

His Ile Tyr Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln Val 
180 185 190 

Lys Thr Cys His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala Glu 
195 200 205 

Phe Asp Leu Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys Ala 
210 215 220 

Leu Leu Asp Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp Ile 
225 230 235 240 

Asp Ser Val Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg Asn 
245 250 255 

Pro Val Gly Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His Leu 
260 265 270 

Glu Pro Met Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg Ser 
275 280 285 

Phe Cys Lys Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His Gly 
290 295 300 

Cys Thr Val Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe Asn 
305 310 315 320 

Ser Val Ser Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr Ala 
325 330 335 

Pro Gln Arg Ala Leu Val Ile Thr His Phe Val His Val Ala Glu Lys 
340 345 350 

Leu Leu Gln Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly Gly 
355 360 365 

Leu Ser His Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His Val 
370 375 380 

Ser Pro Glu Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val Thr 
385 390 395 400 

Ala Thr Gly Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys Val 
405 410 415 

Gly Phe Arg Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val Ala 
420 425 430 

Leu Gln Leu Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg Leu 
435 440 445 

Asn Gly Ala Lys Met Lys Gln Leu Phe Ser Ile Leu Glu Glu Leu Ala 
450 455 460 

Met Val Thr Ser Leu Arg Pro Pro Val Gln Ala Asn Pro Asp Leu Leu 
465 470 475 480 

Ser Leu Leu Thr Val Ser Leu Asp Gln Tyr Gln Thr Glu Asp Glu Leu 
485 490 495 

Tyr Gln Leu Ser Leu Gln Arg Glu Pro Arg Ser Lys Ser Ser Pro Thr 
500 505 510 

Ser Pro Thr Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu Glu 
515 520 525 

Trp Thr Ser Ala Ala Lys Pro Lys Leu Asp Gln Ala Leu Val Val Glu 
530 535 540 

His Ile Glu Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val Asp 
545 550 555 560 

Gly Asp Gly His Ile Ser Gln Glu Glu Phe Gln Ile Ile Arg Gly Asn 
565 570 575 

Phe Pro Tyr Leu Ser Ala Phe Gly Asp Leu Asp Gln Asn Gln Asp Gly 
580 585 590 

Cys Ile Ser Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser Ser 
595 600 605 
Val Leu Gly Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser Asn 
610 615 620 

Ser Leu Arg Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu Gly 
625 630 635 640 

Ile Tyr Lys Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys His 
645 650 655 

Lys Gln Cys Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala Gln 
660 665 670 

Ser Val Ser Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His Ser 
675 680 685 

His His His Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg Arg 
690 695 700 

Gly Ser Arg Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val Glu 
705 710 715 720 

Asp Gly Val Phe Asp Ile His Leu 
725 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 170..300 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 

CGATTTCATT CCTCGCTCCC CACAGGTCCC TCTCCCCAAA ATATTCCCAT CTTGTCCTAG 60 

CCCATCCCCC AGACTATCTC AAGGACCAGC TGTCCCCACG CCCCCGACCT CCACTAGGCC 120 

TGTGCCACCC GCTGCCTGCA GGAAGACGCC CGGTCCCGGG CCGGGTTAG CCC CAT 175 
Pro His 
1 

GGG AAC GGG GTT CGG TCC GAG CCC GGT GGG AGG CTC CCG GAG CGC AGC 223 
Gly Asn Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser 
5 10 15 

CTG GGC CCA GCC CAC CCC GCG CCG GCG GCC ATG GCA GGC ACC CTG GAC 271 
Leu Gly Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp 
20 25 30 

CTG GAC AAG GGC TGC ACG GTG GAG GAG CT 300 
Leu Asp Lys Gly Cys Thr Val Glu Glu Leu 
35 40 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 

Pro His Gly Asn Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu 
1 5 10 15 

Arg Ser Leu Gly Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr 
20 25 30 

Leu Asp Leu Asp Lys Gly Cys Thr Val Glu Glu Leu 
35 40 

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 
GGGATCCCCC TGGTC 15 

(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 
GAATTCGGCA CGAGCCGACG G 21 

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..78 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATG GAG CAG AAG CTG ATC TCC GAG GAG GAC CTG CCC GGG GCA GCT GGA 48 
Met Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Pro Gly Ala Ala Gly 
1 5 10 15 

TCC GCA GCC CAC CCC GGG CCG GCG GCC ATG 78 
Ser Ala Ala His Pro Gly Pro Ala Ala Met 
20 25 

(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 

Met Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Pro Gly Ala Ala Gly 
1 5 10 15 

Ser Ala Ala His Pro Gly Pro Ala Ala Met 
20 25 

(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..33 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GGA TCC GCA GCC CAC CCC GGG CCG GCG GCC ATG 33 
Gly Ser Ala Ala His Pro Gly Pro Ala Ala Met 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Gly Ser Ala Ala His Pro Gly Pro Ala Ala Met 
1 5 10 
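
The short junction sequences and their translations (for example SEQ ID NO:9 and SEQ ID NO:10) can be cross-checked by translating the nucleotide entry with the standard genetic code. The sketch below assumes the standard code (NCBI translation table 1) and reading frame 1; the function name and layout are illustrative, not part of the listing:

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(dna: str) -> list:
    """Translate DNA in frame 1 into three-letter residue codes; spaces are ignored."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


SEQ_ID_NO_9 = "GGA TCC GCA GCC CAC CCC GGG CCG GCG GCC ATG"

if __name__ == "__main__":
    print(" ".join(translate(SEQ_ID_NO_9)))
```

Building the table from the compact 64-character string avoids typing each codon by hand, which is where transcription slips usually creep in.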



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GGACAAAGTG TGTGATGAAC C 21 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CTCATCCTCC GTCTGATACT G 21 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GTAGATGTGG ATCAGCTTGG 20 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
AGGTGGAGAA TGGTCAAGG 19 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GTCATAGTCT GTCTCCTACT 20 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
ACATAGACAG CGTGCCTACC 20

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
TACAACCTTA GGGACACCAG 20

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TGCTGAGCCT GCTCACGGTG 20

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CAAGTGAACA GCACGTCC 18

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GACTATCTCA AGGACCAGCT G 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGTTCGGTCC GAGCCCGG 

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GGAGCGATAC TCCAAGTAGG T



FIGURE 2



CG ATT TCA TTC CTC GCT CCC CAC AGG TCC CTC TCC CCA AAA TAT TCC 47
Ile Ser Phe Leu Ala Pro His Arg Ser Leu Ser Pro Lys Tyr Ser
1 5 10 15

CAT CTT GTC CTA GCC CAT CCC CCA GAC TAT CTC AAG GAC CAG CTG TCC 95
His Leu Val Leu Ala His Pro Pro Asp Tyr Leu Lys Asp Gln Leu Ser
20 25 30

CCA CGC CCC CGA CCT CCA CTA GGC CTG TGC CAC CCG CTG CCT GCA GGA 143
Pro Arg Pro Arg Pro Pro Leu Gly Leu Cys His Pro Leu Pro Ala Gly
35 40 45

AGA CGC CCG GTC CCG GGC CGG GTT AGC CCC ATG GGA ACG CAG CGC CTG 191
Arg Arg Pro Val Pro Gly Arg Val Ser Pro Met Gly Thr Gln Arg Leu
50 55 60

TGT GGC CGC GGG ACT CAA GGC TGG CCT GGC TCA AGT GAA CAG CAC GTC 239
Cys Gly Arg Gly Thr Gln Gly Trp Pro Gly Ser Ser Glu Gln His Val
65 70 75

CAG GAG GCG ACC TCG TCC GCG GGT TTG CAT TCT GGG GTG GAC GAG CTG 287
Gln Glu Ala Thr Ser Ser Ala Gly Leu His Ser Gly Val Asp Glu Leu
80 85 90 95

GGG GTT CGG TCC GAG CCC GGT GGG AGG CTC CCG GAG CGC AGC CTG GGC 335
Gly Val Arg Ser Glu Pro Gly Gly Arg Leu Pro Glu Arg Ser Leu Gly
100 105 110



CCA GCC CAC CCC GCG CCG GCG GCC ATG GCA GGC ACC CTG GAC CTG GAC 383
Pro Ala His Pro Ala Pro Ala Ala Met Ala Gly Thr Leu Asp Leu Asp
115 120 125

AAG GGC TGC ACG GTG GAG GAG CTG CTC CGC GGG TGC ATC GAA GCC TTC 431 
Lys Gly Cys Thr Val Glu Glu Leu Leu Arg Gly Cys Ile Glu Ala Phe
130 135 140

GAT GAC TCC GGG AAG GTG CGG GAC CCG CAG CTG GTG CGC ATG TTC CTC 479
Asp Asp Ser Gly Lys Val Arg Asp Pro Gln Leu Val Arg Met Phe Leu
145 150 155

ATG ATG CAC CCC TGG TAC ATC CCC TCC TCT CAG CTG GCG GCC AAG CTG 527
Met Met His Pro Trp Tyr Ile Pro Ser Ser Gln Leu Ala Ala Lys Leu
160 165 170 175

CTC CAC ATC TAC CAA CAA TCC CGG AAG GAC AAC TCC AAT TCC CTG CAG 575
Leu His Ile Tyr Gln Gln Ser Arg Lys Asp Asn Ser Asn Ser Leu Gln
180 185 190

GTG AAA ACG TGC CAC CTG GTC AGG TAC TGG ATC TCC GCC TTC CCA GCG 623
Val Lys Thr Cys His Leu Val Arg Tyr Trp Ile Ser Ala Phe Pro Ala
195 200 205

GAG TTT GAC TTG AAC CCG GAG TTG GCT GAG CAG ATC AAG GAG CTG AAG 671
Glu Phe Asp Leu Asn Pro Glu Leu Ala Glu Gln Ile Lys Glu Leu Lys
210 215 220

GCT CTG CTA GAC CAA GAA GGG AAC CGA CGG CAC AGC AGC CTA ATC GAC 719
Ala Leu Leu Asp Gln Glu Gly Asn Arg Arg His Ser Ser Leu Ile Asp
225 230 235

ATA GAC AGC GTC CCT ACC TAC AAG TGG AAG CGG CAG GTG ACT CAG CGG 767
Ile Asp Ser Val Pro Thr Tyr Lys Trp Lys Arg Gln Val Thr Gln Arg
240 245 250 255 

AAC CCT GTG GGA CAG AAA AAG CGC AAG ATG TCC CTG TTG TTT GAC CAC 815 
Asn Pro Val Gly Gln Lys Lys Arg Lys Met Ser Leu Leu Phe Asp His
260 265 270



FIGURE 2 (Cont. II)

CTG GAG CCC ATG GAG CTG GCG GAG CAT CTC ACC TAC TTG GAG TAT CGC 863
Leu Glu Pro Met Glu Leu Ala Glu His Leu Thr Tyr Leu Glu Tyr Arg
275 280 285

TCC TTC TGC AAG ATC CTG TTT CAG GAC TAT CAC AGT TTC GTG ACT CAT 911
Ser Phe Cys Lys Ile Leu Phe Gln Asp Tyr His Ser Phe Val Thr His
290 295 300

GGC TGC ACT GTG GAC AAC CCC GTC CTG GAG CGG TTC ATC TCC CTC TTC 959
Gly Cys Thr Val Asp Asn Pro Val Leu Glu Arg Phe Ile Ser Leu Phe
305 310 315

AAC AGC GTC TCA CAG TGG GTG CAG CTC ATG ATC CTC AGC AAA CCC ACA 1007
Asn Ser Val Ser Gln Trp Val Gln Leu Met Ile Leu Ser Lys Pro Thr
320 325 330 335

GCC CCG CAG CGG GCC CTG GTC ATC ACA CAC TTT GTC CAC GTG GCG GAG 1055
Ala Pro Gln Arg Ala Leu Val Ile Thr His Phe Val His Val Ala Glu
340 345 350

AAG CTG CTA CAG CTG CAG AAC TTC AAC ACG CTG ATG GCA GTG GTC GGG 1103
Lys Leu Leu Gln Leu Gln Asn Phe Asn Thr Leu Met Ala Val Val Gly
355 360 365

GGC CTG AGC CAC AGC TCC ATC TCC CGC CTC AAG GAG ACC CAC AGC CAC 1151
Gly Leu Ser His Ser Ser Ile Ser Arg Leu Lys Glu Thr His Ser His
370 375 380

GTT AGC CCT GAG ACC ATC AAG CTC TGG GAG GGT CTC ACG GAA CTA GTG 1199
Val Ser Pro Glu Thr Ile Lys Leu Trp Glu Gly Leu Thr Glu Leu Val
385 390 395

ACG GCG ACA GGC AAC TAT GGC AAC TAC CGG CGT CGG CTG GCA GCC TGT 1247
Thr Ala Thr Gly Asn Tyr Gly Asn Tyr Arg Arg Arg Leu Ala Ala Cys
400 405 410 415

GTG GGC TTC CGC TTC CCG ATC CTG GGT GTG CAC CTC AAG GAC CTG GTG 1295
Val Gly Phe Arg Phe Pro Ile Leu Gly Val His Leu Lys Asp Leu Val
420 425 430

GCC CTG CAG CTG GCA CTG CCT GAC TGG CTG GAC CCA GCC CGG ACC CGG 1343
Ala Leu Gln Leu Ala Leu Pro Asp Trp Leu Asp Pro Ala Arg Thr Arg
435 440 445

CTC AAC GGG GCC AAG ATG AAG CAG CTC TTT AGC ATC CTG GAG GAG CTG 1391
Leu Asn Gly Ala Lys Met Lys Gln Leu Phe Ser Ile Leu Glu Glu Leu
450 455 460

GCC ATG GTG ACC AGC CTG CGG CCA CCA GTA CAG GCC AAC CCC GAC CTG 1439
Ala Met Val Thr Ser Leu Arg Pro Pro Val Gln Ala Asn Pro Asp Leu
465 470 475

CTG AGC CTG CTC ACG GTG TCT CTG GAT CAG TAT CAG ACG GAG GAT GAG 1487
Leu Ser Leu Leu Thr Val Ser Leu Asp Gln Tyr Gln Thr Glu Asp Glu
480 485 490 495

CTG TAC CAG CTG TCC CTG CAG CGG GAG CCG CGC TCC AAG TCC TCG CCA 1535
Leu Tyr Gln Leu Ser Leu Gln Arg Glu Pro Arg Ser Lys Ser Ser Pro
500 505 510

ACC AGC CCC ACG AGT TGC ACC CCA CCA CCC CGG CCC CCG GTA CTG GAG 1583
Thr Ser Pro Thr Ser Cys Thr Pro Pro Pro Arg Pro Pro Val Leu Glu
515 520 525

GAG TGG ACC TCG GCT GCC AAA CCC AAG CTG GAT CAG GCC CTC GTG GTG 1631
Glu Trp Thr Ser Ala Ala Lys Pro Lys Leu Asp Gln Ala Leu Val Val
530 535 540

GAG CAC ATC GAG AAG ATG GTG GAG TCT GTG TTC CGG AAC TTT GAC GTC 1679

Glu His Ile Glu Lys Met Val Glu Ser Val Phe Arg Asn Phe Asp Val
545 550 555



FIGURE 2 (Cont. III)

GAT GGG GAT GGC CAC ATC TCA CAG GAA GAA TTC CAG ATC ATC CGT GGG 1727
Asp Gly Asp Gly His Ile Ser Gln Glu Glu Phe Gln Ile Ile Arg Gly
560 565 570 575

AAC TTC CCT TAC CTC AGC GCC TTT GGG GAC CTC GAC CAG AAC CAG GAT 1775
Asn Phe Pro Tyr Leu Ser Ala Phe Gly Asp Leu Asp Gln Asn Gln Asp
580 585 590

GGC TGC ATC AGC AGG GAG GAG ATG GTT TCC TAT TTC CTG CGC TCC AGC 1823
Gly Cys Ile Ser Arg Glu Glu Met Val Ser Tyr Phe Leu Arg Ser Ser
595 600 605

TCT GTG TTG GGG GGG CGC ATG GGC TTC GTA CAC AAC TTC CAG GAG AGC 1871
Ser Val Leu Gly Gly Arg Met Gly Phe Val His Asn Phe Gln Glu Ser
610 615 620

AAC TCC TTG CGC CCC GTC GCC TGC CGC CAC TGC AAA GCC CTG ATC CTG 1919
Asn Ser Leu Arg Pro Val Ala Cys Arg His Cys Lys Ala Leu Ile Leu
625 630 635

GGC ATC TAC AAG CAG GGC CTC AAA TGC CGA GCC TGT GGA GTG AAC TGC 1967
Gly Ile Tyr Lys Gln Gly Leu Lys Cys Arg Ala Cys Gly Val Asn Cys
640 645 650 655

CAC AAG CAG TGC AAG GAT CGC CTG TCA GTT GAG TGT CGG CGC AGG GCC 2015
His Lys Gln Cys Lys Asp Arg Leu Ser Val Glu Cys Arg Arg Arg Ala
660 665 670

CAG AGT GTG AGC CTG GAG GGG TCT GCA CCC TCA CCC TCA CCC ATG CAC 2063
Gln Ser Val Ser Leu Glu Gly Ser Ala Pro Ser Pro Ser Pro Met His
675 680 685

AGC CAC CAT CAC CGC GCC TTC AGC TTC TCT CTG CCC CGC CCT GGC AGG 2111
Ser His His His Arg Ala Phe Ser Phe Ser Leu Pro Arg Pro Gly Arg
690 695 700

CGA GGC TCC AGG CCT CCA GAG ATC CGT GAG GAG GAG GTA CAG ACG GTG 2159
Arg Gly Ser Arg Pro Pro Glu Ile Arg Glu Glu Glu Val Gln Thr Val
705 710 715

GAG GAT GGG GTG TTT GAC ATC CAC TTG TA ATAGATGCTG TGGTTGGATC 2208
Glu Asp Gly Val Phe Asp Ile His Leu
720 725

AAGGACTCAT TCCTGCCTTG GAGAAAATAC TTCAACCAGA GCAGGGAGCC TGGGGGTGTC 2268

GGGGCAGGAG GCTGGGGATG GGGGTGGGAT ATGAGGGTGG CATGCAGCTG AGGGCAGGGC 2328

CAGGGCTGGT GTCCCTAAGG TTGTACAGAC TCTTGTGAAT ATTTGTATTT TCCAGATGGA 2388

ATAAAAAGGC CCGTGTAATT AACCTTCA 2416



FIGURE 2a (cont. I) 



MCG7 - Cloning of a novel human gene that encodes a guanine exchange factor 



CGATTTCATTCCTCGCTCCCCACAGGTCCCTCTCCCCAAAATATTCCCATCTTGTCCTAG 60
ISFLAPHRSLSPKYSHLVL 19
CCCATCCCCCAGACTATCTCAAGGACCAGCTGTCCCCACGCCCCCGACCTCCACTAGGCC 120
AHPPDYLKDQLSPRPRPPLG 39
TGTGCCACCCGCTGCCTGCAGGAAGACGCCCGGTCCCGGGCCGGGTTAGCCCCATGGGAA 180
LCHPLPAGRRPVPGRVSPMG 59
CGcagcgcctgtgtggccgcgggactcaaggctggcctggctcaagtgaacagcacgtcc 240
TQRLCGRGTQGWPGSSEQHV 79
aggaggcgacctcgtccgcgggtttgcattctggggtggacgagctggGGGTTCGGTCCG 300
QEATSSAGLHSGVDELGVRS 99
AGCCCGGTGGGAGGCTCCCGGAGCGCAGCCTGGGCCCAGCCCACCCCGCGCCGGCGGCCA 360
EPGGRLPERSLGPAHPAPAA 119
TGGCAGGCACCCTGGACCTGGACAAGGGCTGCACGGTGGAGGAGCTGCTCCGCGGGTGCA 420
MAGTLDLDKGCTVEELLRGC 139
TCGAAGCCTTCGATGACTCCGGGAAGGTGCGGGACCCGCAGCTGGTGCGCATGTTCCTCA 480
IEAFDDSGKVRDPQLVRMFL 159
TGATGCACCCCTGGTACATCCCCTCCTCTCAGCTGGCGGCCAAGCTGCTCCACATCTACC 540
MMHPWYIPSSQLAAKLLHIY 179
AACAATCCCGGAAGGACAACTCCAATTCCCTGCAGGTGAAAACGTGCCACCTGGTCAGGT 600
QQSRKDNSNSLQVKTCHLVR 199
ACTGGATCTCCGCCTTCCCAGCGGAGTTTGACTTGAACCCGGAGTTGGCTGAGCAGATCA 660
YWISAFPAEFDLNPELAEQI 219
AGGAGCTGAAGGCTCTGCTAGACCAAGAAGGGAACCGACGGCACAGCAGCCTAATCGACA 720
KELKALLDQEGNRRHSSLID 239
TAGACAGCGTCCCTACCTACAAGTGGAAGCGGCAGGTGACTCAGCGGAACCCTGTGGGAC 780
IDSVPTYKWKRQVTQRNPVG 259
AGAAAAAGCGCAAGATGTCCCTGTTGTTTGACCACCTGGAGCCCATGGAGCTGGCGGAGC 840
QKKRKMSLLFDHLEPMELAE 279 
ATCTCACCTACTTGGAGTATCGCTCCTTCTGCAAGATCCTGTTTCAGGACTATCACAGTT 900 
HLTYLEYRSFCKILFQDYHS 299
TCGTGACTCATGGCTGCACTGTGGACAACCCCGTCCTGGAGCGGTTCATCTCCCTCTTCA 960 
FVTHGCTVDNPVLERFISLF 319 
ACAGCGTCTCACAGTGGGTGCAGCTCATGATCCTCAGCAAACCCACAGCCCCGCAGCGGG 1020
NSVSQWVQLMILSKPTAPQR 339
CCCTGGTCATCACACACTTTGTCCACGTGGCGGAGAAGCTGCTACAGCTGCAGAACTTCA 1080
ALVITHFVHVAEKLLQLQNF 359
ACACGCTGATGGCAGTGGTCGGGGGCCTGAGCCACAGCTCCATCTCCCGCCTCAAGGAGA 1140
NTLMAVVGGLSHSSISRLKE 379
CCCACAGCCACGTTAGCCCTGAGACCATCAAGCTCTGGGAGGGTCTCACGGAACTAGTGA 1200
THSHVSPETIKLWEGLTELV 399
CGGCGACAGGCAACTATGGCAACTACCGGCGTCGGCTGGCAGCCTGTGTGGGCTTCCGCT 1260
TATGNYGNYRRRLAACVGFR 419
TCCCGATCCTGGGTGTGCACCTCAAGGACCTGGTGGCCCTGCAGCTGGCACTGCCTGACT 1320
FPILGVHLKDLVALQLALPD 439
GGCTGGACCCAGCCCGGACCCGGCTCAACGGGGCCAAGATGAAGCAGCTCTTTAGCATCC 1380
WLDPARTRLNGAKMKQLFSI 459
TGGAGGAGCTGGCCATGGTGACCAGCCTGCGGCCACCAGTACAGGCCAACCCCGACCTGC 1440
LEELAMVTSLRPPVQANPDL 479
TGAGCCTGCTCACGGTGTCTCTGGATCAGTATCAGACGGAGGATGAGCTGTACCAGCTGT 1500
LSLLTVSLDQYQTEDELYQL 499 
CCCTGCAGCGGGAGCCGCGCTCCAAGTCCTCGCCAACCAGCCCCACGAGTTGCACCCCAC 1560 
SLQREPRSKSSPTSPTSCTP 519
CACCCCGGCCCCCGGTACTGGAGGAGTGGACCTCGGCTGCCAAACCCAAGCTGGATCAGG 1620
PPRPPVLEEWTSAAKPKLDQ 539
CCCTCGTGGTGGAGCACATCGAGAAGATGGTGGAGTCTGTGTTCCGGAACTTTGACGTCG 1680 



FIGURE 2a (cont. II) 



ALVVEHIEKMVESVFRNFDV 559 

ATGGGGATGGCCACATCTCACAGGAAGAATTCCAGATCATCCGTGGGAACTTCCCTTACC 1740

DGDGHISQEEFQIIRGNFPY 579 

TCAGCGCCTTTGGGGACCTCGACCAGAACCAGGATGGCTGCATCAGCAGGGAGGAGATGG 1800 

LSAFGDLDQNQDGCISREEM 599 

TTTCCTATTTCCTGCGCTCCAGCTCTGTGTTGGGGGGGCGCATGGGCTTCGTACACAACT 1860 

VSYFLRSSSVLGGRMGFVHN 619 

TCCAGGAGAGCAACTCCTTGCGCCCCGTCGCCTGCCGCCACTGCAAAGCCCTGATCCTGG 1920

FQESNSLRPVACRHCKALIL 639

GCATCTACAAGCAGGGCCTCAAATGCCGAGCCTGTGGAGTGAACTGCCACAAGCAGTGCA 1980 

GIYKQGLKCRACGVNCHKQC 659

AGGATCGCCTGTCAGTTGAGTGTCGGCGCAGGGCCCAGAGTGTGAGCCTGGAGGGGTCTG 2040 

KDRLSVECRRRAQSVSLEGS 679

CACCCTCACCCTCACCCATGCACAGCCACCATCACCGCGCCTTCAGCTTCTCTCTGCCCC 2100 

APSPSPMHSHHHRAFSFSLP 699

GCCCTGGCAGGCGAGGCTCCAGGCCTCCAGAGATCCGTGAGGAGGAGGTACAGACGGTGG 2160 

RPGRRGSRPPEIREEEVQTV 719 

AGGATGGGGTGTTTGACATCCACTTGTAATAGATGCTGTGGTTGGATCAAGGACTCATTC 2220 
EDGVFDIHL* 

CTGCCTTGGAGAAAATACTTCAACCAGAGCAGGGAGCCTGGGGGTGTCGGGGCAGGAGGC 2280 
TGGGGATGGGGGTGGGATATGAGGGTGGCATGCAGCTGAGGGCAGGGCCAGGGCTGGTGT 2340 
CCCTAAGGTTGTACAGACTCTTGTGAATATTTGTATTTTCCAGATGGAATAAAAAGGCCC 2400
GTGTAATTAACCTTC(A)



Figure 2b 



CGATTTCATTCCTCGCTCCCCACAGGTCCCTCTCCCCAAAATATTCCCATCTTGTCCTAG 60
CCCATCCCCCAGACTATCTCAAGGACCAGCTGTCCCCACGCCCCCGACCTCCACTAGGCC 120
TGTGCCACCCGCTGCCTGCAGGAAGACGCCCGGTCCCGGGCCGGGTTAGCCCCATGGGAA 180

* p h g n ; 

CGGGGTTCGGTCCGAGCCCGGTGGGAGGCTCCCGGAGCGCAGCCTGGGCCCAGCCCACCC 240
gvrsepggrlperslgpahp 

CGCGCCGGCGGCCATGGCAGGCACCCTGGACCTGGACAAGGGCTGCACGGTGGAGGAGCT 300
a p a a M A G T L D L D K G C T V E E L



CGATTTCATT
CCCATCCCCC
TGTGCCACCC
CGCAGCGCCT



CCTCGCTCCC 
AGACTATCTC 
GCTGCCTGCA 
GTGTGGCCGC



AGGAGGCGAC CTCGTCCGCG 



CACAGGTCCC 
AAGGACCAGC 
GGAAGACGCC 
GGGACTCAAG 
***tcag** 
GGTTTGCATT 



TCTCCCCAAA 
TGTCCCCACG 
CGGTCCCGGG 
GCTGGCCTGG 
****ag**** 
CTGGGGTGGA 



ATATTCCCAT CTTGTCCTAG 60 
CCCCCGACCT CCACTAGGCC 120 
CCGGGTTAGC CCCATGGGAA 180 
CTCAAGTGAA CAGCACGTCC 240 
^.********* ***a*g***t;> 
CGAGCTGGGG GTTCGGTCCG 300 
acagg
mouse



****** **** 



g*****t-**a **-*catt** 

AGCCCGGTGG GAGGCTCCCG 

***a*t.**** *******tga 

TGGCAGGCAC CCTGGACCTG 

****ga**** t*^ 
TCGAAGCCTT CGATGACTCC 

********** ^********t. 

TGATGCACCC CTGGTACATC 
****«***♦* ♦********a 

AACAATCCCG GAAGGACAAC 

*g**»***** ********** 

ACTGGATCTC CGCCTTCCCA 
********** g^«******** 

AGGAGCTGAA GGCTCTGCTA 
********** *******t** 

TAGACAGCGT 
*c**g**t** 



GAGCGCAGCC 
***t*t*a*t 
GACAAGGGCT 

GGGAAGGTGC 



CCCTCCTCTC 
**^* ****** 

TCCAATTCCC 
******** ^* 

GCGGAGTTTG 
it * g^* * * * *^* 

GACCAAGAAG 



***aa**aa* 
TGGGCCCAGC 
****t*t*** 
GCACGGTGGA 
****^***** 

GGGACCCGCA 
*a**t**a** 
AGCTGGCGGC 

******* 

TGCAGGTGAA 
*a***a**** 
ACTTGAACCC 
********** 

GGAACCGACG 
******* 



g**ct***** 
CCACCCCGCG
** *_*t;g**a 
GGAGCTGCTC 
********** 

GCTGGTGCGC 
* ♦ *-^^*** *** 

CAAGCTGCTC 
g**a****** 
AACGTGCCAC 
******^*** 

GGAGTTGGCT 
^***^-.***** 

GCACAGCAGC 
********** 



**a**aat**> 

CCGGCGGCCA 360 
*****Q^****> 

CGCGGGTGCA 420 
**t**c**t*> 
ATGTTCCTCA 480 
*****^****> 

CACATCTACC 540 
* **t** * * t*> 
CTGGTCAGGT 600 
^*********> 

GAGCAGATCA 660 
**a*******> 
CTAATCGACA 720 
**^*******> 

730 



Figure 5 



CACGCCTCGGAAGGGAGGTTTGGGGTCGGTGGTTTCACAGTGAGTCTGTCTGA^^^ 60 
TGGTCGGAAACCGTTACCCGCTCTCCTAGGCCCGGCTAGTGGGGACCCCAACCGCCTGCG 120

*ARLVGTPTAC> 
GCTGCCCCTCCCAAGTTCCTCCCTGTTGGCCAGGCATCCAGGTCTCCAGTCTCCGAGCTG 180
GCPSQVPPCWPGIQVSSLRA> 
CGGAGAACCCACCGCCACATGCGGCTGCCCCTTTCCATTCGACCCTGTGGGGAGCCAGGC 240
AENPPPHAAAPFHSTLWGAR>
TTCCGGGGCCCCGTTCCTCCTGTGTGAACTGGGCCCCCCGCCCCCATTCCCAGACATCAA 300
LPGPRSSCVNWAPRPHSQTS>
GGCCGCGTCTCCAGATAGCCACGATTTCATTCCTCGCTCCCCACAGGTCCCTCTCCCCAA 360
RPRLQIATISFLAPHRSLSP> 
AATATTCCCATCTTGTCCTAGCCCATCCCCCAGACTATCTCAAGGACCAGCTGTCCCCAC 420
KYSHLVLAHPPDYLKDQLSP> 
GCCCCCGACCTCCACTAGGCCTGTGCCACCCGCTGCCTGCAGGAAGACGCCCGGTCCCGG 480 
RPRPPLGLCHPLPAGRRPVP> 
GCCGGGTTAGCCCCATGGGAACGcagcgcctgtgtggccgcgggactcaaggctggcctg 540



* p h g n 

GRVSPMGTQRLCGRGTQGWP> 
gctcaagtgaacagcacgtccaggaggcgacctcgtccgcgggtttgcattctggggtgg 600 
rsSEQHVQEATSSAGLHSGV> 
acgagctggGGGTTCGGTCCGAGCCCGGTGGGAGGCTCCCGGAGCGCAGCCTGGGCCCAG 660 
nELGVRSEPGGRLPERSLGP> 
CCCACCCCGCGCCGGCGGCCATGGCAGGCACCCTGGACCTGGACAAGGGCTGCACGGTGG 720
AHPAPAAMAGTLDLDKGCTV>



Figure 6



Figure 7 (Cont. I)



SmaI/ApaI (both lost) 0.00




ApaI/SmaI (both lost) 1.00



Plasmid name: clone 16 in pGEX-3X
Plasmid size: 6.00 kb



Figure 7 (Cont. II)



EcoRI 0.00 




Plasmid name: clone 19 in pGEX-1 
Plasmid size: 6.00 kb 




HindIII 2.50



Plasmid name: clone 5 in pGEM-11zf
Plasmid size: 5.50 kb



Figure 7 (Cont. IV)




Plasmid name: clone 27 in pGEX-2T
Plasmid size: 7.50 kb 



